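As big data becomes more widespread, big data security and governance is drawing more and more attention. This article introduces Falcon, a data processing framework and Apache top-level project, starting with installation.
1. Hadoop configuration changes
1.1 Modify yarn-site.xml
Machine
On nodes 主机-1, 主机-2 and 主机-3 (placeholder hostnames), as the hdfs user, in the /var/local/hadoop/hadoop-2.6.0/etc/hadoop directory
Commands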
Add the following properties to yarn-site.xml:
<property>
<name>mapred.jobtracker.taskScheduler</name>
<value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
<property>
<name>mapred.fairscheduler.allocation.file</name>
<value>/var/local/hadoop/hadoop-2.6.0/etc/hadoop/fair-scheduler.xml</value>
</property>
<property>
<name>mapred.fairscheduler.preemption</name>
<value>true</value>
</property>
<property>
<name>mapred.fairscheduler.assignmultiple</name>
<value>true</value>
</property>
<property>
<name>mapred.fairscheduler.poolnameproperty</name>
<value>mapred.job.queue.name</value>
<description>job.set("mapred.job.queue.name",pool); </description>
</property>
<property>
<name>mapred.fairscheduler.preemption.only.log</name>
<value>true</value>
</property>
<property>
<name>mapred.fairscheduler.preemption.interval</name>
<value>15000</value>
</property>
<property>
<name>mapred.queue.names</name>
<value>default,hadoop,hive</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>20960</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
</property>
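The three memory values at the end of the list have to be consistent with each other. The following is a minimal sketch of a sanity check, assuming the reasoning that a container request larger than a node's memory cannot be placed on that node; the function name check_yarn_memory is hypothetical and not part of Hadoop.

```shell
#!/bin/sh
# Sanity-check the three YARN memory settings from yarn-site.xml.
# check_yarn_memory NODE_MB MIN_MB MAX_MB   (hypothetical helper, not a Hadoop tool)
check_yarn_memory() {
    node_mb=$1; min_mb=$2; max_mb=$3
    if [ "$min_mb" -gt "$max_mb" ]; then
        echo "minimum-allocation-mb ($min_mb) exceeds maximum-allocation-mb ($max_mb)"
        return 1
    fi
    if [ "$max_mb" -gt "$node_mb" ]; then
        echo "maximum-allocation-mb ($max_mb) exceeds nodemanager memory ($node_mb)"
        return 1
    fi
    # Upper bound on containers per node if every container has the minimum size
    echo "ok: up to $((node_mb / min_mb)) minimum-size containers per node"
}

# Values taken from the yarn-site.xml properties above
check_yarn_memory 20960 1024 2048
```

With the values above, each node can hold at most 20 containers of the minimum size, which gives a rough feel for how many tasks can run concurrently.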
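1.2 Add the fair-scheduling policy file fair-scheduler.xml
Falcon runs a large number of MapReduce jobs, so YARN scheduling needs some tuning to keep the load balanced; adding a fair-scheduling policy here is recommended.
Machine
On nodes 主机-1, 主机-2 and 主机-3, as the hdfs user, in the /var/local/hadoop/hadoop-2.6.0/etc/hadoop directory
Commands
vim fair-scheduler.xml
Add the following content to the file: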
<?xml version="1.0"?>
<allocations>
<pool name="hive">
<minMaps>90</minMaps>
<minReduces>20</minReduces>
<maxRunningJobs>20</maxRunningJobs>
<weight>2.0</weight>
<minSharePreemptionTimeout>30</minSharePreemptionTimeout>
</pool>
<pool name="hadoop">
<minMaps>9</minMaps>
<minReduces>2</minReduces>
<maxRunningJobs>20</maxRunningJobs>
<weight>1.0</weight>
<minSharePreemptionTimeout>30</minSharePreemptionTimeout>
</pool>
<user name="hadoop">
<maxRunningJobs>6</maxRunningJobs>
</user>
<poolMaxJobsDefault>10</poolMaxJobsDefault>
<userMaxJobsDefault>8</userMaxJobsDefault>
<defaultMinSharePreemptionTimeout>600</defaultMinSharePreemptionTimeout>
<fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
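Parameter notes
The names below are the element names of the YARN Fair Scheduler allocation file; the sample above uses the older pool-style names (minMaps, minReduces, maxRunningJobs).
minResources: the guaranteed minimum resources, in the form "X mb,Y vcores". The sum of the minimums across all queues should preferably not exceed the maximum memory available to YARN.
maxResources: the maximum resources a queue may use. The Fair Scheduler ensures that a queue never uses more than this amount.
maxRunningApps: the maximum number of applications running at the same time.
schedulingPolicy: the scheduling mode of the queue; fifo, fair and drf are supported.
aclSubmitApps: the users and groups allowed to submit applications to the queue; the default is *. Several entries are written as comma-separated users, a space, then comma-separated groups, for example "hadoopuser,sparkuser hadoopgroup".
aclAdministerApps: the administrators of the queue; an administrator can kill any job in the queue.
userMaxAppsDefault: the default maximum number of applications a user can run at the same time.
Notes:
Only leaf queues are used when submitting to a queue.
maxRunningApps is very useful and should be set according to the memory currently available in the cluster.
The right value varies with the Hadoop version. If it is set too high, submitting many YARN applications in a short time can quickly exhaust YARN resources, and each job then waits a long time for tasks to be allocated.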
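The element names described in the parameter notes can be combined into a YARN-style allocation file. The following is a minimal sketch, not the author's configuration: the queue names mirror the hive and hadoop pools above, while every resource figure, the FAIR_SCHEDULER_OUT variable and the output path /tmp/fair-scheduler-yarn.xml are illustrative assumptions.

```shell
#!/bin/sh
# Write a YARN-style allocation file using the element names from the notes.
# All numbers and the output path are illustrative assumptions.
out=${FAIR_SCHEDULER_OUT:-/tmp/fair-scheduler-yarn.xml}

cat > "$out" <<'EOF'
<?xml version="1.0"?>
<allocations>
  <queue name="hive">
    <minResources>4096 mb,4 vcores</minResources>
    <maxResources>16384 mb,16 vcores</maxResources>
    <maxRunningApps>20</maxRunningApps>
    <weight>2.0</weight>
    <schedulingPolicy>fair</schedulingPolicy>
    <aclSubmitApps>*</aclSubmitApps>
  </queue>
  <queue name="hadoop">
    <minResources>1024 mb,1 vcores</minResources>
    <maxResources>8192 mb,8 vcores</maxResources>
    <maxRunningApps>20</maxRunningApps>
    <weight>1.0</weight>
  </queue>
  <userMaxAppsDefault>8</userMaxAppsDefault>
</allocations>
EOF

echo "wrote $out"
```

As far as I know, on Hadoop 2.x the YARN Fair Scheduler is selected with yarn.resourcemanager.scheduler.class and locates this file through yarn.scheduler.fair.allocation.file in yarn-site.xml; check the documentation for your Hadoop version before relying on either form.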
2. Build and deploy Oozie-4.2.0
Oozie is the scheduler Falcon uses to run its jobs, so Falcon cannot work without it.
2.1 Build Oozie
Machine
On the cluster namenode node 主机-1, as the hdfs user, in the /home/hdfs directory
Commands
wget http://mirror.bit.edu.cn/apache/oozie/4.2.0/oozie-4.2.0.tar.gz
Download the Oozie source code
tar zxvf oozie-4.2.0.tar.gz
cd oozie-4.2.0
Extract oozie-4.2.0 and enter the Oozie source directory
bin/mkdistro.sh -DskipTests -Phadoop-2 -Dhadoop.auth.version=2.6.0 -Ddistcp.version=2.6.0 -Dhive.version=1.2.1 -Dsqoop.version=1.4.6
Build oozie-4.2.0. When the build finishes, the Oozie distribution archive oozie-4.2.0-distro.tar.gz is in the distro/target/ directory
tar -zxf distro/target/oozie-4.2.0-distro.tar.gz
mv oozie-4.2.0 /var/local/hadoop/
Extract the distribution archive and move the resulting oozie-4.2.0 directory to /var/local/hadoop
2.2 Modify the HDFS configuration
Machine
On the cluster namenode node 主机-1, as the hdfs user.
Commands
vim /var/local/hadoop/hadoop-2.6.0/etc/hadoop/core-site.xml
Edit the Hadoop core-site.xml file and add the following configuration:
<property>
<name>hadoop.proxyuser.hdfs.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hdfs.groups</name>
<value>*</value>
</property>
Here hdfs is the name of the user that will run Oozie later
hdfs dfsadmin -refreshSuperUserGroupsConfiguration
yarn rmadmin -refreshSuperUserGroupsConfiguration
Apply the configuration without restarting the Hadoop cluster
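Because the user name is embedded in the property names, the two proxyuser entries change whenever Oozie runs under a different account. The following sketch prints the pair for a given user; the function name proxyuser_properties is hypothetical, and the wildcard values simply repeat the ones used above.

```shell
#!/bin/sh
# Print the two core-site.xml proxyuser properties for a given user.
# proxyuser_properties USER   (hypothetical helper)
proxyuser_properties() {
    user=$1
    for key in hosts groups; do
        printf '<property>\n'
        printf '  <name>hadoop.proxyuser.%s.%s</name>\n' "$user" "$key"
        printf '  <value>*</value>\n'
        printf '</property>\n'
    done
}

# The user that runs Oozie in this article
proxyuser_properties hdfs
```

The output can be pasted inside the configuration element of core-site.xml before running the two refresh commands.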
2.3 Add the Oozie libext extension packages
Machine
On the cluster namenode node 主机-1, as the hdfs user.
Commands
cd /var/local/hadoop/oozie-4.2.0
mkdir libext
tar zxvf oozie-sharelib-4.2.0.tar.gz
Create a libext directory under oozie-4.2.0 and extract the sharelib archive, which provides the share/ directory used below
cp $HADOOP_HOME/share/hadoop/*/*.jar libext/
cp $HADOOP_HOME/share/hadoop/*/lib/*.jar libext/
cp $HIVE_HOME/lib/*.jar libext/
cp share/lib/hcatalog/*.jar libext/
Copy the Hadoop, Hive and HCatalog jars into Oozie's libext directory
cd libext
mv servlet-api-2.5.jar servlet-api-2.5.jar.bak
mv jsp-api-2.1.jar jsp-api-2.1.jar.bak
mv jasper-compiler-5.5.23.jar jasper-compiler-5.5.23.jar.bak
mv jasper-runtime-5.5.23.jar jasper-runtime-5.5.23.jar.bak
Set aside the Hadoop jars that conflict with Tomcat
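The four mv commands can also be written as a loop that tolerates missing files, which helps when the jar versions differ slightly between Hadoop builds. This is a sketch under that assumption; the function name disable_conflicting_jars is hypothetical, and the demo runs against a scratch directory rather than the real libext.

```shell
#!/bin/sh
# Rename jars that clash with Oozie's embedded Tomcat so they are no longer loaded.
# disable_conflicting_jars DIR JAR...   (hypothetical helper)
disable_conflicting_jars() {
    dir=$1; shift
    for jar in "$@"; do
        if [ -f "$dir/$jar" ]; then
            mv "$dir/$jar" "$dir/$jar.bak"
            echo "disabled $jar"
        else
            echo "skipped $jar (not found)"
        fi
    done
}

# Demo against a scratch directory instead of the real libext
demo=$(mktemp -d)
touch "$demo/servlet-api-2.5.jar" "$demo/jsp-api-2.1.jar"
disable_conflicting_jars "$demo" \
    servlet-api-2.5.jar jsp-api-2.1.jar jasper-compiler-5.5.23.jar
```

For the real directory, pass /var/local/hadoop/oozie-4.2.0/libext as the first argument followed by the four jar names from the commands above.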
Upload ext-2.2.zip (the ExtJS 2.2 library, which you need to find and download yourself) to the $OOZIE_HOME/libext/ directory
wget http://mirror.bit.edu.cn/mysql/Downloads/Connector-J/mysql-connector-java-5.1.38.tar.gz
tar zxvf mysql-connector-java-5.1.38.tar.gz
cp mysql-connector-java-5.1.38/mysql-connector-java-5.1.38-bin.jar /var/local/hadoop/oozie-4.2.0/libext
Download the MySQL driver and copy it into the libext/ directory
2.4 Add the Oozie configuration
Machine
On the cluster namenode node 主机-1, as the hdfs user
Commands
cd /var/local/hadoop/oozie-4.2.0
vim conf/oozie-site.xml
Edit the Oozie configuration file and add the following content
<property>
<name>oozie.service.JPAService.create.db.schema</name>
<value>true</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.driver</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.url</name>
<value>jdbc:mysql://主机-1:3306/oozie?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.username</name>
<value>oozie</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.password</name>
<value>oozie</value>
</property>
<property>
<name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
<value>*=/var/local/hadoop/hadoop-2.6.0/etc/hadoop</value>
</property>
<property>
<name>oozie.service.ProxyUserService.proxyuser.hdfs.hosts</name>
<value>*</value>
</property>
<property>
<name>oozie.service.ProxyUserService.proxyuser.hdfs.groups</name>
<value>*</value>
</property>
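Notes
The oozie.service.HadoopAccessorService.hadoop.configurations property points to the etc/hadoop directory under $HADOOP_HOME. 主机-1 is the node that runs Oozie, and hdfs, the name embedded in the ProxyUserService properties, is the user that runs Oozie, matching the Hadoop core-site.xml configuration above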
2.5 Add the MySQL user
Machine
On the cluster namenode node 主机-1, as the hdfs user
Commands
mysql -uroot -padmin
Log in to the MySQL console as the administrator
create database oozie;
Create a database named oozie
grant all privileges on oozie.* to 'oozie'@'localhost' identified by 'oozie';
Grant access to the oozie database and create a user named oozie with the password oozie
grant all privileges on oozie.* to 'oozie'@'%' identified by 'oozie';
Grant access to the oozie database from any host
update mysql.user set host='%' where user='root' and host='localhost';
insert into mysql.user (host,user,password) values('主机-1','oozie',PASSWORD('oozie'));
Set up authentication for the oozie user
FLUSH PRIVILEGES;
quit
Exit the MySQL console
Restart MySQL so the configuration takes effect:
sudo service mysqld restart
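When the same account has to be reachable from several hosts, the GRANT statements can be generated rather than typed by hand. The sketch below prints statements in the same MySQL 5.x GRANT ... IDENTIFIED BY form used above; the function name oozie_grants is hypothetical, and newer MySQL versions may no longer accept this form.

```shell
#!/bin/sh
# Print GRANT statements for one database user across several client hosts.
# oozie_grants DB USER PASSWORD HOST...   (hypothetical helper)
oozie_grants() {
    db=$1; user=$2; pass=$3; shift 3
    for host in "$@"; do
        printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%s' IDENTIFIED BY '%s';\n" \
            "$db" "$user" "$host" "$pass"
    done
    printf 'FLUSH PRIVILEGES;\n'
}

# Same database, user, password and hosts as the commands above
oozie_grants oozie oozie oozie localhost '%'
```

The output can be reviewed first and then piped into the mysql client, for example with `oozie_grants oozie oozie oozie localhost '%' | mysql -uroot -p`.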
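2.6 Configure the Oozie environment variables
Machine:
On 主机-1, 主机-2 and 主机-3, as the hdfs user, in any directory
Commands:
sudo vim /etc/profile
On 主机-1, 主机-2 and 主机-3, append the following to the end of the file: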
export OOZIE_HOME=/var/local/hadoop/oozie-4.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$OOZIE_HOME/bin
2.7 Reload the environment variables
Machine:
On 主机-1, 主机-2 and 主机-3, in the current terminal, in any directory
Commands:
source /etc/profile
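2.8 Deploy Oozie
Machine
On the cluster namenode node 主机-1, as the hdfs user
Commands
cd /var/local/hadoop/oozie-4.2.0
bin/oozie-setup.sh prepare-war
Build the Oozie WAR file
bin/ooziedb.sh create -sqlfile oozie.sql -run
Initialize the database
vim oozie-server/conf/server.xml
Edit the server's conf/server.xml file and comment out the following entry (the entry is not reproduced in this text)
bin/oozie-setup.sh sharelib create -fs hdfs://主机-1:9000
Upload the jars in the Oozie share library to HDFS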
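2.9 Start and test Oozie
Machine
On the cluster namenode node 主机-1, as the hdfs user
Commands
jps
List the running Java services. If the history server (JobHistoryServer) is not in the list, start it with:
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
Run jps again afterward to check the result
cd /var/local/hadoop/oozie-4.2.0
bin/oozied.sh start
Start the Oozie service
oozie admin -oozie http://主机-1:11000/oozie -status
Check that the service started correctly. If it prints System mode: NORMAL, startup succeeded; otherwise it failed
Notes
You can open http://主机-1:11000/oozie/ in a browser to reach the Oozie web console and view Oozie's status, where 主机-1 is the IP address of the node on which Oozie is installed and running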
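Oozie can take a little while to come up after oozied.sh start, so the status check is often repeated. The following is a minimal sketch of a polling loop built around the status command above; the function names oozie_is_normal and wait_for_oozie and the retry counts are assumptions, and the last line uses echo as a stand-in for the real command so the sketch can be tried without a server.

```shell
#!/bin/sh
# Succeed if the text on stdin reports Oozie's normal system mode.
oozie_is_normal() {
    grep -q 'System mode: NORMAL'
}

# Poll a status command until it reports NORMAL or the attempts run out.
# wait_for_oozie ATTEMPTS DELAY_SECONDS COMMAND...   (hypothetical helper)
wait_for_oozie() {
    attempts=$1; delay=$2; shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" 2>/dev/null | oozie_is_normal; then
            echo "Oozie is up (attempt $i)"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "Oozie did not report NORMAL after $attempts attempts"
    return 1
}

# Real use, with the URL from the step above:
#   wait_for_oozie 30 5 oozie admin -oozie http://主机-1:11000/oozie -status
# Dry run with a stand-in command:
wait_for_oozie 3 0 echo "System mode: NORMAL"
```

Passing the status command as arguments keeps the loop independent of the Oozie URL, so the same function works for any node.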
3. Build and deploy Falcon
3.1 Build Falcon
Machine
On the cluster namenode node 主机-1, as the hdfs user, in the /home/hdfs directory
Commands
wget http://mirror.bit.edu.cn/apache/falcon/0.9/apache-falcon-0.9-sources.tar.gz
Download the Falcon source
tar -zxvf apache-falcon-0.9-sources.tar.gz
cd falcon-sources-0.9/
Extract the source tree falcon-sources-0.9 and enter it
export MAVEN_OPTS="-Xmx1024m -XX:MaxPermSize=256m -noverify" && mvn clean install -Dhadoop.version=2.6.0 -Doozie.version=4.2.0 -DskipTests
Build and package the Falcon source. If an npm error appears during the build, install npm with:
sudo yum -y install npm
Install npm and use a mirror in China
npm --registry https://registry.npm.taobao.org info underscore
mvn clean assembly:assembly -DskipTests -DskipCheck=true
When the build finishes, the target/ directory contains the archives apache-falcon-0.9-bin.tar.gz and apache-falcon-0.9-bin.zip
tar -zxvf target/apache-falcon-0.9-bin.tar.gz
mv falcon-0.9 /var/local/hadoop/
Extract the Falcon distribution and move the resulting falcon-0.9 directory to /var/local/hadoop
3.2 Modify the Oozie configuration
Machine
On the cluster namenode node 主机-1, as the hdfs user
Commands
cd /var/local/hadoop/oozie-4.2.0
vim conf/oozie-site.xml
Add the following content to Oozie's oozie-site.xml configuration file
<!-- Oozie EL Extension configurations for falcon -->
<property>
<name>oozie.service.ELService.ext.functions.coord-job-submit-instances</name>
<value>
now=org.apache.oozie.extensions.OozieELExtensions#ph1_now_echo,
today=org.apache.oozie.extensions.OozieELExtensions#ph1_today_echo,
yesterday=org.apache.oozie.extensions.OozieELExtensions#ph1_yesterday_echo,
currentWeek=org.apache.oozie.extensions.OozieELExtensions#ph1_currentWeek_echo,
lastWeek=org.apache.oozie.extensions.OozieELExtensions#ph1_lastWeek_echo,
currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph1_currentMonth_echo,
lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph1_lastMonth_echo,
currentYear=org.apache.oozie.extensions.OozieELExtensions#ph1_currentYear_echo,
lastYear=org.apache.oozie.extensions.OozieELExtensions#ph1_lastYear_echo,
formatTime=org.apache.oozie.coord.CoordELFunctions#ph1_coord_formatTime_echo,
latest=org.apache.oozie.coord.CoordELFunctions#ph2_coord_latest_echo,
future=org.apache.oozie.coord.CoordELFunctions#ph2_coord_future_echo
</value>
<description>
EL functions declarations, separated by commas, format is [PREFIX:]NAME=CLASS#METHOD.
This property is a convenience property to add extensions to the built in
executors without having to
include all the built in ones.
</description>
</property>
<property>
<name>oozie.service.ELService.ext.functions.coord-action-create-inst</name>
<value>
now=org.apache.oozie.extensions.OozieELExtensions#ph2_now_inst,
today=org.apache.oozie.extensions.OozieELExtensions#ph2_today_inst,
yesterday=org.apache.oozie.extensions.OozieELExtensions#ph2_yesterday_inst,
currentWeek=org.apache.oozie.extensions.OozieELExtensions#ph2_currentWeek_inst,
lastWeek=org.apache.oozie.extensions.OozieELExtensions#ph2_lastWeek_inst,
currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_currentMonth_inst,
lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_lastMonth_inst,
currentYear=org.apache.oozie.extensions.OozieELExtensions#ph2_currentYear_inst,
lastYear=org.apache.oozie.extensions.OozieELExtensions#ph2_lastYear_inst,
latest=org.apache.oozie.coord.CoordELFunctions#ph2_coord_latest_echo,
future=org.apache.oozie.coord.CoordELFunctions#ph2_coord_future_echo,
formatTime=org.apache.oozie.coord.CoordELFunctions#ph2_coord_formatTime,
user=org.apache.oozie.coord.CoordELFunctions#coord_user
</value>
<description>
EL functions declarations, separated by commas, format is [PREFIX:]NAME=CLASS#METHOD.
This property is a convenience property to add extensions to the built in
executors without having to
include all the built in ones.
</description>
</property>
<property>
<name>oozie.service.ELService.ext.functions.coord-action-create</name>
<value>
now=org.apache.oozie.extensions.OozieELExtensions#ph2_now,
today=org.apache.oozie.extensions.OozieELExtensions#ph2_today,
yesterday=org.apache.oozie.extensions.OozieELExtensions#ph2_yesterday,
currentWeek=org.apache.oozie.extensions.OozieELExtensions#ph2_currentWeek,
lastWeek=org.apache.oozie.extensions.OozieELExtensions#ph2_lastWeek,
currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_currentMonth,
lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_lastMonth,
currentYear=org.apache.oozie.extensions.OozieELExtensions#ph2_currentYear,
lastYear=org.apache.oozie.extensions.OozieELExtensions#ph2_lastYear,
latest=org.apache.oozie.coord.CoordELFunctions#ph2_coord_latest_echo,
future=org.apache.oozie.coord.CoordELFunctions#ph2_coord_future_echo,
formatTime=org.apache.oozie.coord.CoordELFunctions#ph2_coord_formatTime,
user=org.apache.oozie.coord.CoordELFunctions#coord_user
</value>
<description>
EL functions declarations, separated by commas, format is [PREFIX:]NAME=CLASS#METHOD.
This property is a convenience property to add extensions to the built in
executors without having to
include all the built in ones.
</description>
</property>
<property>
<name>oozie.service.ELService.ext.functions.coord-job-submit-data</name>
<value>
now=org.apache.oozie.extensions.OozieELExtensions#ph1_now_echo,
today=org.apache.oozie.extensions.OozieELExtensions#ph1_today_echo,
yesterday=org.apache.oozie.extensions.OozieELExtensions#ph1_yesterday_echo,
currentWeek=org.apache.oozie.extensions.OozieELExtensions#ph1_currentWeek_echo,
lastWeek=org.apache.oozie.extensions.OozieELExtensions#ph1_lastWeek_echo,
currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph1_currentMonth_echo,
lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph1_lastMonth_echo,
currentYear=org.apache.oozie.extensions.OozieELExtensions#ph1_currentYear_echo,
lastYear=org.apache.oozie.extensions.OozieELExtensions#ph1_lastYear_echo,
dataIn=org.apache.oozie.extensions.OozieELExtensions#ph1_dataIn_echo,
instanceTime=org.apache.oozie.coord.CoordELFunctions#ph1_coord_nominalTime_echo_wrap,
formatTime=org.apache.oozie.coord.CoordELFunctions#ph1_coord_formatTime_echo,
dateOffset=org.apache.oozie.coord.CoordELFunctions#ph1_coord_dateOffset_echo,
user=org.apache.oozie.coord.CoordELFunctions#coord_user
</value>
<description>
EL constant declarations, separated by commas, format is [PREFIX:]NAME=CLASS#CONSTANT.
This property is a convenience property to add extensions to the built in
executors without having to
include all the built in ones.
</description>
</property>
<property>
<name>oozie.service.ELService.ext.functions.coord-action-start</name>
<value>
now=org.apache.oozie.extensions.OozieELExtensions#ph2_now,
today=org.apache.oozie.extensions.OozieELExtensions#ph2_today,
yesterday=org.apache.oozie.extensions.OozieELExtensions#ph2_yesterday,
currentWeek=org.apache.oozie.extensions.OozieELExtensions#ph2_currentWeek,
lastWeek=org.apache.oozie.extensions.OozieELExtensions#ph2_lastWeek,
currentMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_currentMonth,
lastMonth=org.apache.oozie.extensions.OozieELExtensions#ph2_lastMonth,
currentYear=org.apache.oozie.extensions.OozieELExtensions#ph2_currentYear,
lastYear=org.apache.oozie.extensions.OozieELExtensions#ph2_lastYear,
latest=org.apache.oozie.coord.CoordELFunctions#ph3_coord_latest,
future=org.apache.oozie.coord.CoordELFunctions#ph3_coord_future,
dataIn=org.apache.oozie.extensions.OozieELExtensions#ph3_dataIn,
instanceTime=org.apache.oozie.coord.CoordELFunctions#ph3_coord_nominalTime,
dateOffset=org.apache.oozie.coord.CoordELFunctions#ph3_coord_dateOffset,
formatTime=org.apache.oozie.coord.CoordELFunctions#ph3_coord_formatTime,
user=org.apache.oozie.coord.CoordELFunctions#coord_user
</value>
<description>
EL functions declarations, separated by commas, format is [PREFIX:]NAME=CLASS#METHOD.
This property is a convenience property to add extensions to the built in
executors without having to
include all the built in ones.
</description>
</property>
<property>
<name>oozie.service.ELService.ext.functions.coord-sla-submit</name>
<value>
instanceTime=org.apache.oozie.coord.CoordELFunctions#ph1_coord_nominalTime_echo_fixed,
user=org.apache.oozie.coord.CoordELFunctions#coord_user
</value>
<description>
EL functions declarations, separated by commas, format is [PREFIX:]NAME=CLASS#METHOD.
</description>
</property>
<property>
<name>oozie.service.ELService.ext.functions.coord-sla-create</name>
<value>
instanceTime=org.apache.oozie.coord.CoordELFunctions#ph2_coord_nominalTime,
user=org.apache.oozie.coord.CoordELFunctions#coord_user
</value>
<description>
EL functions declarations, separated by commas, format is [PREFIX:]NAME=CLASS#METHOD.
</description>
</property>
<!-- Required to Notify Falcon on Workflow job status. -->
<property>
<name>oozie.services.ext</name>
<value>
org.apache.oozie.service.JMSAccessorService,
org.apache.oozie.service.JMSTopicService,
org.apache.oozie.service.EventHandlerService
</value>
</property>
<property>
<name>oozie.service.EventHandlerService.event.listeners</name>
<value>
org.apache.oozie.jms.JMSJobEventListener
</value>
</property>
<property>
<name>oozie.jms.producer.connection.properties</name>
<value>
java.naming.factory.initial#org.apache.activemq.jndi.ActiveMQInitialContextFactory;java.naming.provider.url#tcp://主机-1:61616
</value>
</property>
<property>
<name>oozie.service.JMSTopicService.topic.name</name>
<value>
WORKFLOW=ENTITY.TOPIC, COORDINATOR=ENTITY.TOPIC
</value>
<description>
Topic options are ${username} or a fixed string which can be specified as default or for a
particular job type.
For e.g To have a fixed string topic for workflows, coordinators and bundles,
specify in the following comma-separated format: {jobtype1}={some_string1}, {jobtype2}={some_string2}
where job type can be WORKFLOW, COORDINATOR or BUNDLE.
Following example defines topic for workflow job, workflow action, coordinator job, coordinator action,
bundle job and bundle action
WORKFLOW=workflow,
COORDINATOR=coordinator,
BUNDLE=bundle
For jobs with no defined topic, default topic will be ${username}
</description>
</property>
<property>
<name>oozie.service.JMSTopicService.topic.prefix</name>
<value>FALCON.</value>
<description>
This can be used to append a prefix to the topic in oozie.service.JMSTopicService.topic.name. For eg: oozie.
</description>
</property>
Notes
Add the configuration entries from $FALCON_HOME/oozie/conf/oozie-site.xml to $OOZIE_HOME/conf/oozie-site.xml, and change the address in Oozie's JMS connection property oozie.jms.producer.connection.properties to the node where Oozie runs, that is, 主机-1.
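With this many properties to merge, it is easy to miss one. The following sketch lists which property names are still absent from a Hadoop-style XML file; the function name missing_properties is hypothetical, the match is a plain text search for the name element rather than real XML parsing, and the demo uses a scratch file in place of $OOZIE_HOME/conf/oozie-site.xml.

```shell
#!/bin/sh
# List the property names that are absent from a Hadoop-style XML config file.
# missing_properties FILE NAME...   (hypothetical helper, plain text match)
missing_properties() {
    file=$1; shift
    for name in "$@"; do
        grep -qF "<name>$name</name>" "$file" || echo "$name"
    done
}

# Scratch file standing in for $OOZIE_HOME/conf/oozie-site.xml
site=$(mktemp)
cat > "$site" <<'EOF'
<configuration>
  <property>
    <name>oozie.services.ext</name>
    <value>org.apache.oozie.service.JMSAccessorService</value>
  </property>
</configuration>
EOF

# Prints the two names that the scratch file does not define
missing_properties "$site" \
    oozie.services.ext \
    oozie.jms.producer.connection.properties \
    oozie.service.JMSTopicService.topic.prefix
```

An empty result means every listed name is present; it does not check that the values are correct.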
3.3 Add the Falcon jars to the Oozie libraries
Machine
On the cluster namenode node 主机-1, as the hdfs user
Commands
cd /var/local/hadoop/oozie-4.2.0
cp /var/local/hadoop/falcon-0.9/oozie/libext/*.jar libext/
Copy the extension jars from Falcon's oozie directory into the $OOZIE_HOME/libext folder
bin/oozied.sh stop
bin/oozie-setup.sh prepare-war
bin/oozied.sh start
Redeploy and restart Oozie
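3.4 Configure the Falcon client
Machine
On the cluster namenode node 主机-1, as the hdfs user
Commands
cd /var/local/hadoop/falcon-0.9
vim conf/client.properties
Change the value of falcon.url in client.properties to
falcon.url=https://{主机-1}:{port}/
Notes
falcon.url specifies the address of the Falcon server, which in this example is the IP address of 主机-1; port is the port configured when Falcon starts, 15443 by default.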
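3.5 Modify the Falcon configuration file
Machine
On the cluster namenode node 主机-1, as the hdfs user
Commands
cd /var/local/hadoop/falcon-0.9
vim conf/startup.properties
Change the value of *.broker.url as follows
*.broker.url=tcp://主机-1:61616
Notes
*.broker.url is the address to which Falcon's bundled ActiveMQ sends messages, that is, the node where Falcon runs, which in this example is the IP address of 主机-1.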
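The edits to client.properties and startup.properties both come down to setting one key in a properties file, which can be scripted instead of done in vim. The following is a sketch under that assumption; the function name set_property is hypothetical, it only handles simple KEY=VALUE lines, and the demo works on a scratch file that combines both keys rather than on the real Falcon files.

```shell
#!/bin/sh
# Set KEY=VALUE in a properties file: replace an existing assignment of the
# key, or append one if the key is absent.
# set_property FILE KEY VALUE   (hypothetical helper)
set_property() {
    file=$1; key=$2; value=$3
    tmp=$(mktemp)
    awk -v k="$key" -v v="$value" '
        index($0, k "=") == 1 { print k "=" v; found = 1; next }
        { print }
        END { if (!found) print k "=" v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Scratch file standing in for conf/startup.properties and conf/client.properties
props=$(mktemp)
printf '%s\n' '*.broker.url=tcp://localhost:61616' \
              'falcon.url=https://localhost:15443/' > "$props"

set_property "$props" '*.broker.url' 'tcp://主机-1:61616'
set_property "$props" 'falcon.url' 'https://主机-1:15443/'
cat "$props"
```

The key is matched literally at the start of the line, so the asterisk in *.broker.url needs no escaping.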
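3.6 Configure the Falcon environment variables
Machine:
On 主机-1, 主机-2 and 主机-3, as the hdfs user, in any directory
Commands:
sudo vim /etc/profile
On 主机-1, 主机-2 and 主机-3, append the following to the end of the file: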
export FALCON_HOME=/var/local/hadoop/falcon-0.9
export PATH=$PATH:$HADOOP_HOME/bin:$FALCON_HOME/bin
3.7 Reload the environment variables
Machine:
On 主机-1, 主机-2 and 主机-3, in the current terminal, in any directory
Commands:
source /etc/profile
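3.8 Create the Falcon client
Machine
On the cluster namenode node 主机-1, as the root user
Commands
useradd -U -m falcon-dashboard -G users
groups falcon-dashboard
If the output is falcon-dashboard : falcon-dashboard users, the user was created successfully
scp -r /var/local/hadoop/falcon-0.9 主机@主机-client:/home/主机/
Copy Falcon to the client machine
Notes
Copying to the client requires the password of the user on the client machine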
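3.9 Start Falcon
Machine
On the cluster namenode node 主机-1, as the hdfs user
Commands
cd /var/local/hadoop/falcon-0.9
bin/falcon-start
Start the Falcon server
jps
List the Java processes; if FalconServer is in the list, startup succeeded
Notes
From the client, you can open https://主机-1:15443 in a browser to reach the Falcon web console. Note that the Falcon server uses HTTPS; entering an http address results in an error.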